## Project Summary

We are an interdisciplinary research team of specialists in artificial intelligence and biology. We apply AI and image processing techniques to the monitoring of cyanobacterial blooms in Argentina.

Cyanobacterial blooms are massive proliferations of aquatic microorganisms that represent a significant ecological, public health, and economic concern worldwide. These events frequently affect freshwater ecosystems and can pose risks to drinking water supplies, recreational activities, and aquatic biodiversity.



## Dataset Description

Within the framework of this project, our objective is to develop automated algorithms capable of detecting and analyzing cyanobacteria in environmental samples. To this end, we created a dataset of high-resolution microscopy images of filamentous bloom-forming cyanobacteria collected from shallow lakes in the Pampas region of Buenos Aires Province, Argentina.

This dataset was designed to support research on automated species detection and classification using computer vision and deep learning approaches. The images were obtained from environmental samples collected during cyanobacterial bloom events and represent locally generated data from freshwater ecosystems in the region.



## Research Applications

Beyond taxonomic identification, this project also aims to enable the automated estimation of cell abundance and biovolume. These measurements are essential for assessing the potential risks associated with different human uses of water bodies.
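
As an illustration of the kind of measurement involved, the sketch below estimates biovolume by treating each filament as a cylinder. This geometric approximation is an assumption made here for illustration, not a method stated by the project. The function names, the example measurements, and the premise that diameters and lengths are already calibrated in micrometers are likewise hypothetical.

```python
import math


def filament_biovolume_um3(diameter_um: float, length_um: float) -> float:
    """Volume (in cubic micrometers) of a cylinder with the given diameter and length."""
    if diameter_um <= 0 or length_um <= 0:
        raise ValueError("diameter and length must be positive")
    # Cylinder volume: pi / 4 * d^2 * L
    return math.pi / 4.0 * diameter_um ** 2 * length_um


def biovolume_mm3_per_litre(filaments, counted_volume_ml: float) -> float:
    """Biovolume concentration for (diameter_um, length_um) pairs measured
    in a counted sample volume given in millilitres."""
    if counted_volume_ml <= 0:
        raise ValueError("counted volume must be positive")
    total_um3 = sum(filament_biovolume_um3(d, l) for d, l in filaments)
    um3_per_ml = total_um3 / counted_volume_ml
    # 1 mm^3 = 1e9 um^3 and 1 L = 1e3 mL, so um^3/mL * 1e-6 = mm^3/L
    return um3_per_ml * 1e-6


# Hypothetical measurements: two filaments, 4 um wide and 100 um long,
# counted in 0.001 mL of sample.
example = [(4.0, 100.0), (4.0, 100.0)]
concentration = biovolume_mm3_per_litre(example, counted_volume_ml=0.001)
```

In an automated pipeline, the diameter and length would come from image measurements converted with the microscope's pixel calibration, which this sketch does not cover.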

Currently, these analyses typically require specialized personnel and involve labor-intensive and time-consuming laboratory procedures. This limitation becomes particularly critical during public health events associated with cyanobacterial blooms, when timely and reliable information is needed to support monitoring and decision-making processes.

By providing this dataset, we seek to facilitate the development and evaluation of automated image analysis methods that can help researchers and environmental monitoring agencies improve the detection and quantification of bloom-forming cyanobacteria.



## Synthetic Inference Dataset

In addition to the environmental microscopy images, the dataset includes 150 synthetic images generated with image processing techniques. They simulate controlled variations in cyanobacterial structures and environmental conditions.

These images are intended solely for post-training inference. They should not be used for training; their purpose is to assess model robustness and generalization under simulated, production-like conditions.
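
One way to respect this constraint is to build training and validation splits from the real images only and keep the synthetic set aside for inference. The sketch below is a minimal example under assumptions: the image identifiers, the 80/20 split, and the function name are hypothetical, and only the image counts come from this README.

```python
import random


def split_real_images(real_ids, val_fraction=0.2, seed=0):
    """Shuffle the real image ids reproducibly and return (train_ids, val_ids)."""
    ids = sorted(real_ids)
    random.Random(seed).shuffle(ids)
    n_val = int(round(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]


# Hypothetical identifiers; the actual file naming may differ.
real_ids = [f"real_{i:04d}" for i in range(382)]
synthetic_ids = [f"synthetic_{i:04d}" for i in range(150)]

train_ids, val_ids = split_real_images(real_ids)

# The synthetic images never enter training or validation.
leaked = (set(train_ids) | set(val_ids)) & set(synthetic_ids)
if leaked:
    raise RuntimeError("synthetic images must stay inference-only")
```

Fixing the seed keeps the split reproducible across experiments, so results on the synthetic set remain comparable between models.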



## Dataset Statistics

- Image type: microscopy images  
- Target organisms: filamentous bloom-forming cyanobacteria  
- Region: Pampean shallow lakes, Buenos Aires Province, Argentina  
- Image resolution: 1536 × 2048 pixels  
- Real microscopy images: 382  
- Synthetic images (inference-only set): 150  
- Annotation type: multi-label image classification  
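
In multi-label classification, an image can carry several labels at once, so annotations are commonly encoded as multi-hot vectors. The sketch below shows that encoding and a simple thresholding of model scores. The class names are placeholders (the actual label set is not listed here), and the 0.5 threshold is an arbitrary assumption.

```python
# Placeholder class names; replace with the dataset's actual label set.
CLASSES = ["taxon_a", "taxon_b", "taxon_c"]


def to_multi_hot(labels, classes=CLASSES):
    """Encode the labels present in one image as a 0/1 vector over the classes."""
    present = set(labels)
    unknown = present - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in present else 0 for c in classes]


def from_scores(scores, classes=CLASSES, threshold=0.5):
    """Return every class whose predicted score reaches the threshold."""
    if len(scores) != len(classes):
        raise ValueError("expected one score per class")
    return [c for c, s in zip(classes, scores) if s >= threshold]


target = to_multi_hot(["taxon_c", "taxon_a"])  # image containing two taxa
predicted = from_scores([0.9, 0.2, 0.5])       # hypothetical model scores
```

Unlike single-label classification, each class is decided independently, which is why a per-class threshold is used instead of picking the single highest score.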



## Potential Applications

This dataset can support research and development in several areas, including:
- Automated detection of bloom-forming cyanobacteria  
- Multi-label classification of microscopy images  
- Bioimage analysis using deep learning  
- Environmental monitoring using computer vision  
- Development of AI-assisted tools for freshwater ecosystem assessment